Sequences of relevant genetic elements
>5BC_Cassette (synthesized by IDT) 

GCTTTACCTCGCACTGCCCAGAGTGACATGTTTGCGAACCTCATCACTCGTTGCGGCAAT 
ACTTTCGTGCCAATCCGGTACGTGGTGTATGCAAGGGAAATTAACGGACGGCCTCAATTC 
CTGCAAGGTAACGCACCGCGGCCCAAGAGCCATTAAGCGATATAATCGCACATCTGGCCA 
ACCCGCCACGTACCGGATTGGCACGCTTGCAGAGAATCCTGGGCTCTTCAACCAACAGTG 
ACGGGGGCTTATTAGGAGGATTTTGATACGGACGCGCAACCGTCGTCAGGCAGCTTTGAA 
GCCCGTCTCTCCGGATGCCAAGCTTGTTGTGCCAATCCGGTACGTGGCGGCCGAGTTCGC 
TCACCTTTTTGAATCCTCGGACGCCATAACTAACAGCCCGTTTATGGAAGAGTATGCACT 
GCCTAAGGGCGGAGGGCCAAGAGTCCTCCAGGTACCGGATTGGCACTGGCCTACGTGCCC 
CAATTACCACATTAAAGATATCTGCACTGGCGTCCCCTCTTCTCGAGGACGAGGGTAAAA 
AAGCGCCATTGCACTAGGACTTACCGCGGAGACTGCCTCTGCTGGCCTACGTGCCAATCC 
GGTACGTGGCGTTGCATGTATTGCAGCCTCAGGGACGTCAGTGGATCATGAAGGTAGAGC 
ATGCGTCCTCTGCTGTTAAAATCTGAGTTCTGGACAAACTACCAATTGGCCACGTACCGG 
ATTGGCACGAAAGTATTGCCGGACAGCATCTTCCTTGCCTCAACATGTCGAACACAGTGG 
CACGATGCATGAGTCT 

>BCextension (synthesized by GeneWiz) 

CGCGCTAGCGGCAATACTTTCGTGCCAATCCGGTACGTGGGATATAGGATTGATTTTTCT 
TCACTGTACGCCGGAGAACATAAGAAGCGTCATAACCCTATCAGTCCCAAGAATAAGTCC 
CAGTATTAACTGTTTCAGCACCACGTACCGGATTGGCACGCTTGCAGAGAAATAGGGAAG 
GCTCATAGCGTTCGAAGTAGTTCCCAACCAGAACACTCCTGGTGAGTCGCTGGAAGCCAT 
CTATAGAAAGGACGACGCTCGGGCCGGTGTAGCCAAGCTTGTTGTGCCAATCCGGTACGT 
GGACAACGTCAACGTTAATCTCGTGCATCAGATCCTAGTTGGTTGAGAACAGAGCGCGAC 
GTGGGGCTGGGAGCCGATACCCGTAAGTAGGAGAGGGCAGCCCCAGGTACCGGATTGGCA 
CTGGCCTACGTGCGCCATCGCACCATTAAACATACCCGGATCGATATTTAGGTGTGCGGG 
GCTCCTGCATCAATTAATACATCGTTCGAACCAGACCCGTCCATTCATCTCGCCTGCTGG 
CCTACGTGCCAATCCGGTACGTGGGCGATTAATCAGCATCCAAGGTTGCACACCGGTTTA 
ATCGTGGGACATCGTCAAGCGTCAATATCCTAAAGACCTGCGAGGTTTAGGCATCAGTGC 
GGGGCCACGTACCGGATTGGCACGAAAGTATTGCCACCTCGAAAGTGACGATTGTTATTG 
ATCGGTTTGCACCCCGACCAGCCAGGCATCTCATCTCCACGGGGATCGAGGGCATCAATC 
ACATTTGACGGATCCGCG 

>Rci sequence (from NCBI Reference Sequence: NC_013120.1, REGION: complement 72912..74066)

ATGCCGTCTCCACGCATCCGTAAAATGTCCCTGTCACGCGCACTGGATAAGTACCTGAAA 
ACAGTTTCTGTTCACAAGAAAGGGCATCAACAGGAGTTTTACCGGAGCAATGTTATCAAG 
CGATATCCCATTGCTCTTCGGAATATGGACGAAATAACAACCGTTGATATTGCTACATAC 
AGAGACGTTCGTTTAGCAGAAATAAACCCCCGAACGGGTAAAGCCATTACAGGTAATACT 
GTACGTCTTGAACTCGCCCTTCTGTCATCTCTGTTCAATATTGCTCGTGTTGAATGGGGA 
ACCTGTCGTACTAACCCGGTTGAACTGGTTCGCAAGCCGAAAGTATCCTCCGGACGAGAT 
CGCCGGCTAACGTCTTCAGAAGAACGTCGCCTTTCTCGCTATTTCCGCGAAAAAAATCTG 
ATGTTGTATGTCATTTTCCATCTTGCCCTTGAAACAGCCATGCGGCAGGGCGAAATACTG 
GCCTTACGTTGGGAGCACATTGATTTGCGCCACGGTGTGGCTCATTTACCTGAAACCAAA 
AACGGTCACTCACGGGATGTTCCTCTGTCCAGACGTGCCCGTAACTTTCTTCAAATGATG 
CCCGTTAATCTCCACGGCAATGTTTTTGATTACACCGCATCCGGCTTTAAAAATGCCTGG 
AGAATAGCCACACAACGACTTCGCATCGAGGACCTGCATTTTCACGATCTACGGCATGAA 
GCAATAAGCCGCTTCTTCGAACTGGGTAGCCTGAATGTAATGGAGATTGCTGCAATATCA 
GGACATCGTTCCATGAATATGCTGAAACGGTATACTCATCTTCGTGCATGGCAACTGGTC 
AGTAAGCTTGATGCCCGCCGGCGGCAGACACAAAAAGTGGCAGCATGGTTTGTGCCGTAT 
CCTGCCCATATCACGACTATCGATGAAGAAAATGGGCAGAAAGCGCATCGTATTGAGATC 
GGTGATTTTGATAACCTTCACGTCACTGCCACAACAAAAGAGGAAGCAGTTCACCGCGCC 
AGTGAGGTTTTGTTGCGTACACTGGCCATTGCAGCACAGAAAGGCGAACGTGTCCCATCT 
CCCGGAGCGTTACCTGTTAACGACCCTGACTACATTATGATTTGCCCTCTGAACCCGGGC
AGCACACCGCTGTAA 

Plasmid maps provided: 

All plasmids and associated maps are included here, and on Benchling.com: 
https://benchling.com/ipeikon/ipeikon published/ 

• IDP190: 5BC Cassette 

• IDP205: T7->Rci; 5BC Cassette

• DIG35: pKat->Rci; 5BC Cassette 

• BCextension: 6 fragment BC extension 

• DIG70: T7->Rci; 11BC Cassette 

• DIG71: pKat->Rci; 11BC Cassette 

Description of Supplementary Files: 

• crecode.m : Simulates Cre recombination on a cassette where fragments are separated by two 
lox sites in opposing orientation 

• rcicode.m : Simulates Rci recombination on a cassette where fragments are separated by one 
sfx site in alternating orientation 

• rcicode2.m : Simulates Rci recombination on a cassette where fragments are separated by two 
sfx sites in opposing orientation 

• randDNA.m : Generates random DNA sequences for bootstrapping Smith-Waterman (SW) alignment thresholds

• procRCI.m : Processes data from Sanger sequencing to reconstruct barcodes 

• procRCI_PB.m : Processes data from PacBio sequencing to reconstruct barcodes 
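
As an illustration of the bootstrapping idea behind randDNA.m, the following is a minimal Python sketch, not the authors' MATLAB code. It scores a reference sequence against random DNA of the same length with a small Smith-Waterman implementation and takes a high quantile of those scores as a threshold. The function names, the scoring parameters (match 2, mismatch -1, gap -2), the 99% quantile, and the number of random sequences are all assumptions made for this example.

```python
import random


def rand_dna(length, rng):
    """Return a uniformly random DNA string of the given length."""
    return "".join(rng.choice("ACGT") for _ in range(length))


def sw_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score with a linear gap penalty."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            # Local alignment: a cell never drops below zero.
            score = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            cur.append(score)
            best = max(best, score)
        prev = cur
    return best


def bootstrap_threshold(reference, n_random=200, quantile=0.99, seed=0):
    """Score `reference` against random sequences; return a high quantile.

    Scores above the returned value are unlikely to arise by chance
    under this (assumed) uniform random-sequence null model.
    """
    rng = random.Random(seed)
    scores = sorted(
        sw_score(reference, rand_dna(len(reference), rng))
        for _ in range(n_random)
    )
    index = min(len(scores) - 1, int(quantile * len(scores)))
    return scores[index]
```

An alignment of a read against a known barcode fragment would then be accepted only if its score exceeds the value returned by bootstrap_threshold for that fragment.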

Supplementary Notes 
Supplementary Note 1 

Simulations (see Methods for details) suggested that the Cre cassettes are subject to considerable biases. 
Specifically, fragments at the ends of the cassette are favored for retention (Supplementary Figure S1a). This 
follows from a simple observation: more combinations of lox sites, when acted on by Cre, result in the 
excision of middle fragments than of end fragments (Supplementary Figure S1b). These inherent biases 
severely limit the practical diversity that can be generated with Cre-based cassettes. 

Supplementary Figure 1. Biases of the Cre architecture. (A) Simulated Cre recombination on 10,000 
cassettes of length n=100 reveals extreme biases for retaining end fragments. (B) There are many more 
pairs of lox sites that can lead to the excision of more central fragments. 
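
The counting argument in panel (B) can be made concrete with a short Python sketch. It uses a deliberately simplified model that is an assumption of this example, not the model in crecode.m: a cassette of n fragments has one site at each of its n+1 boundaries, any two sites can pair, and a pair excises every fragment between them. Site orientation and the two-lox-sites-per-junction layout are ignored.

```python
from itertools import combinations


def excising_pairs(n_fragments):
    """Count, for each fragment, the site pairs whose excision removes it.

    Simplified model (assumption): site i sits immediately left of
    fragment i, and one extra site follows the last fragment. A pair
    (left, right) excises fragments left .. right-1.
    """
    counts = [0] * n_fragments
    for left, right in combinations(range(n_fragments + 1), 2):
        for k in range(left, right):
            counts[k] += 1
    return counts


print(excising_pairs(5))  # [5, 8, 9, 8, 5]
```

Under this model fragment k (counting from 0) is removed by (k+1)(n-k) pairs, so for n=100 an end fragment is removed by 100 pairs while a central fragment is removed by about 2,500.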

Supplementary Figure 2. 5BC Cassette stability. Sanger sequencing of the 5BC cassette after several 
generations of growth in bacterial cells shows no recombination of the cassette. 

Supplementary Figure 3. T7-induced Rci expression results in single recombination events. All of the 
reconstructed sequences resulting from shuffling by induced expression of Rci from the T7 promoter can 
be explained by a single recombination event. 

Supplementary Figure 4. 11BC Cassette stability. Sanger sequencing of the 11BC cassette after several 
generations of growth in bacterial cells shows no recombination of the cassette. 

Supplementary Figure 5. Cassettes approach complete randomness as the number of recombination 
events increases. (A) Simulated cassettes subjected to 5, 10, or 15 random recombination events. The 
colormaps show the distribution of fragment occupancy at each position in the cassette. Colors are scaled 
from 0 to 25%. (B) The bias at each position was calculated as the number of times the original fragment 
appeared in its original position divided by the number of cassettes. The dotted black line indicates the 
expected occupancy of the original fragment at each position in a cassette with completely random 
occupancy. 
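
The bias measure defined in panel (B) can be sketched in Python as follows. This is an illustrative simplification and not the authors' rcicode.m: each recombination event is assumed to invert the run of fragments between two boundaries chosen uniformly at random, and both sfx site orientation and fragment orientation are ignored. The function names, the default of 10,000 cassettes, and the seed are assumptions of this example.

```python
import random


def shuffle_cassette(n_fragments, n_events, rng):
    """Apply n_events random inversions to the ordered cassette 0..n-1."""
    cassette = list(range(n_fragments))
    for _ in range(n_events):
        # Pick two distinct boundaries and invert the fragments between them.
        left, right = sorted(rng.sample(range(n_fragments + 1), 2))
        cassette[left:right] = cassette[left:right][::-1]
    return cassette


def position_bias(n_fragments, n_events, n_cassettes=10000, seed=0):
    """Fraction of cassettes in which each position still holds its
    original fragment, as in the definition given for panel (B)."""
    rng = random.Random(seed)
    hits = [0] * n_fragments
    for _ in range(n_cassettes):
        cassette = shuffle_cassette(n_fragments, n_events, rng)
        for position, fragment in enumerate(cassette):
            if fragment == position:
                hits[position] += 1
    return [h / n_cassettes for h in hits]


if __name__ == "__main__":
    n = 11
    for events in (5, 10, 15):
        bias = position_bias(n, events, n_cassettes=2000)
        print(events, [round(b, 2) for b in bias])
    print("random expectation:", round(1 / n, 3))
```

For a cassette with completely random occupancy, each of the n fragments is equally likely at every position, so the expected value of this bias is 1/n, which corresponds to the dotted line described in the caption.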

Sequencing Data Files: 

• 5BC_CCS_IDP205.fastq 

o No Rci expression (T7->Rci; 5BC Cassette) 

• 5BC_CCS_DIG35.fastq 

o Rci expression (pKat->Rci; 5BC Cassette) 

• HBC_CCS_DIG71.fastq 

o Rci expression (pKat->Rci; 11BC Cassette) 